
    GERNERMED++: Transfer Learning in German Medical NLP

    We present a statistical model for German medical natural language processing, trained for named entity recognition (NER) and released as an open, publicly available model. The work serves as a refined successor to our first GERNERMED model, which is substantially outperformed by our new model. We demonstrate the effectiveness of combining multiple techniques to achieve strong entity recognition performance by means of transfer learning on pretrained deep language models (LM), word alignment and neural machine translation. Given the scarcity of open, public medical entity recognition models for German texts, this work offers the German medical NLP research community a baseline model. Since our model is based on public English data, its weights are provided without legal restrictions on usage and distribution. The sample code and the statistical model are available at: https://github.com/frankkramer-lab/GERNERMED-p

    GERNERMED++: semantic annotation in German medical NLP through transfer-learning, translation and word alignment

    We present a statistical model, GERNERMED++, for German medical natural language processing, trained for named entity recognition (NER) and released as an open, publicly available model. We demonstrate the effectiveness of combining multiple techniques to achieve strong entity recognition performance by means of transfer learning on pre-trained deep language models (LM), word alignment and neural machine translation, outperforming a pre-existing baseline model on several datasets. Given the scarcity of open, public medical entity recognition models for German texts, this work offers the German medical NLP research community a baseline model. The work serves as a refined successor to our first GERNERMED model. As in our previous work, our trained model is publicly available to other researchers. The sample code and the statistical model are available at: https://github.com/frankkramer-lab/GERNERMED-p
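    One of the techniques named above, word alignment, is typically used to project entity annotations from translated English sentences onto their German counterparts. A minimal sketch of such span projection (the alignment format, example sentence and labels are illustrative assumptions, not the actual GERNERMED++ code):

    ```python
    def project_spans(src_spans, alignment):
        """Project token-level entity spans from a source (English) sentence
        onto a target (German) sentence, given a word alignment.

        src_spans: list of (start, end, label) token spans, end exclusive.
        alignment: list of (src_idx, tgt_idx) aligned token index pairs.
        """
        projected = []
        for start, end, label in src_spans:
            # Collect all target tokens aligned to any source token in the span.
            tgt_idxs = sorted(t for s, t in alignment if start <= s < end)
            if tgt_idxs:
                # Use the contiguous hull of aligned target tokens as the new span.
                projected.append((tgt_idxs[0], tgt_idxs[-1] + 1, label))
        return projected

    # English: "The patient received ibuprofen 400 mg"  (tokens 0-5)
    # German : "Der Patient erhielt 400 mg Ibuprofen"   (tokens 0-5)
    alignment = [(0, 0), (1, 1), (2, 2), (3, 5), (4, 3), (5, 4)]
    spans = [(3, 4, "DRUG"), (4, 6, "DOSE")]
    print(project_spans(spans, alignment))  # [(5, 6, 'DRUG'), (3, 5, 'DOSE')]
    ```

    Real pipelines obtain the alignment from a statistical or neural aligner and must additionally handle unaligned or discontinuous target tokens; this sketch keeps only the core projection step.
    
    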

    GERNERMED: an open German medical NER model

    The current state of adoption of well-structured electronic health records, and of digital methods for storing medical patient data in structured formats, can often be considered inferior to traditional, unstructured text-based patient data documentation. Data mining in medical data analysis therefore often has to rely solely on the processing of unstructured data to retrieve relevant information. In natural language processing (NLP), statistical models have been shown to be successful in various tasks such as part-of-speech tagging, relation extraction (RE) and named entity recognition (NER). In this work, we present GERNERMED, the first open, neural NLP model for NER tasks dedicated to detecting medical entity types in German text data. We avoid the conflicting goals of protecting sensitive patient data from training data extraction and publishing the statistical model weights by training our model on a custom dataset that was translated from publicly available datasets in a foreign language by a pretrained neural machine translation model. The sample code and the statistical model are available at: https://github.com/frankkramer-lab/GERNERME

    Annotated dataset creation through large language models for non-English medical NLP

    Obtaining text datasets with semantic annotations is a laborious process, yet crucial for supervised training in natural language processing (NLP). In general, developing and applying new NLP pipelines in domain-specific contexts often requires custom-designed datasets to address the task at hand in a supervised machine learning fashion. When operating on medical data in non-English languages, this exposes several minor and major, interconnected problems, such as the lack of task-matching datasets as well as of task-specific pre-trained models. In our work, we suggest leveraging pre-trained large language models for training data acquisition in order to obtain sufficiently large datasets for training smaller and more efficient models for use-case-specific tasks. To demonstrate the effectiveness of our approach, we create a custom dataset, which we use to train GPTNERMED, a medical NER model for German texts; our method remains language-independent in principle. Our obtained dataset as well as our pre-trained models are publicly available at https://github.com/frankkramer-lab/GPTNERMED
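    The LLM-based training data acquisition described above can be sketched as follows: the model is prompted to re-emit a sentence with inline entity tags, which are then parsed back into character-level spans usable as silver-standard NER annotations. The tag format, entity labels and example sentence are illustrative assumptions, not the actual GPTNERMED prompt format:

    ```python
    import re

    # Matches inline tags of the assumed form [Label]...[/Label].
    TAG_RE = re.compile(r"\[(?P<label>\w+)\](?P<text>.*?)\[/(?P=label)\]")

    def parse_llm_annotation(marked_up):
        """Convert inline-tagged LLM output into (plain_text, char_spans)."""
        plain, spans = [], []
        cursor = 0   # position in the tagged string
        offset = 0   # length of plain text emitted so far
        for match in TAG_RE.finditer(marked_up):
            plain.append(marked_up[cursor:match.start()])
            offset += match.start() - cursor
            entity = match.group("text")
            spans.append((offset, offset + len(entity), match.group("label")))
            plain.append(entity)
            offset += len(entity)
            cursor = match.end()
        plain.append(marked_up[cursor:])
        return "".join(plain), spans

    text, spans = parse_llm_annotation(
        "Der Patient erhielt [Drug]Ibuprofen[/Drug] [Dose]400 mg[/Dose] täglich."
    )
    print(text)   # Der Patient erhielt Ibuprofen 400 mg täglich.
    print(spans)  # [(20, 29, 'Drug'), (30, 36, 'Dose')]
    ```

    In practice such parsed spans still need validation (e.g. checking that the untagged text matches the original sentence) before being used as training data.
    
    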

    Perspective on code submission and automated evaluation platforms for university teaching

    We present a perspective on platforms for code submission and automated evaluation in the context of university teaching. Due to the COVID-19 pandemic, such platforms have become an essential asset for remote courses and a reasonable standard for structured code submission, given the increasing numbers of students in computer science. Utilizing automated code evaluation techniques has notable positive impacts for both students and teachers in terms of quality and scalability. We identified relevant technical and non-technical requirements for such platforms in terms of practical applicability and secure code submission environments. Furthermore, a survey among students was conducted to obtain empirical data on their general perception. We conclude that code submission and automated evaluation involve continuous maintenance, yet lower the required workload for teachers and provide better evaluation transparency for students.

    Comment: Source code: https://github.com/frankkramer-lab/MISITcms-app. Manuscript accepted for publication at MedInfo 2021, Virtual Conference, October 2-4, 2021, IOS Press.

    IE-Vnet: deep learning-based segmentation of the inner ear's total fluid space

    Background: In-vivo MR-based high-resolution volumetric quantification methods of the endolymphatic hydrops (ELH) are highly dependent on a reliable segmentation of the inner ear's total fluid space (TFS). This study aimed to develop a novel open-source inner ear TFS segmentation approach using a dedicated deep learning (DL) model.
    Methods: The model was based on a V-Net architecture (IE-Vnet) and a multivariate (MR scans: T1, T2, FLAIR, SPACE) training dataset (D1, 179 consecutive patients with peripheral vestibulocochlear syndromes). Ground-truth TFS masks were generated in a semi-manual, atlas-assisted approach. IE-Vnet model segmentation performance, generalizability, and robustness to domain shift were evaluated on four heterogeneous test datasets (D2-D5, n = 4 × 20 ears).
    Results: The IE-Vnet model predicted TFS masks with consistently high congruence to the ground truth in all test datasets (Dice overlap coefficient: 0.9 ± 0.02, Hausdorff maximum surface distance: 0.93 ± 0.71 mm, mean surface distance: 0.022 ± 0.005 mm) without significant differences concerning side (two-sided Wilcoxon signed-rank test, p > 0.05) or dataset (Kruskal-Wallis test, p > 0.05; post-hoc Mann-Whitney U, FDR-corrected, all p > 0.2). Prediction took 0.2 s and was 2,000 times faster than a state-of-the-art atlas-based segmentation method.
    Conclusion: IE-Vnet TFS segmentation demonstrated high accuracy, robustness toward domain shift, and rapid prediction times. Its output works seamlessly with a previously published open-source pipeline for automatic ELS segmentation. IE-Vnet could serve as a core tool for high-volume trans-institutional studies of the inner ear. Code and pre-trained models are available free and open-source at https://github.com/pydsgz/IEVNet
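    The Dice overlap coefficient reported in the results measures the agreement between a predicted and a ground-truth binary mask. A minimal sketch on flattened binary masks (pure Python for illustration, not the IE-Vnet evaluation code):

    ```python
    def dice_coefficient(pred, truth):
        """Dice overlap of two equal-length binary masks:
        2 * |pred ∩ truth| / (|pred| + |truth|)."""
        assert len(pred) == len(truth)
        intersection = sum(p & t for p, t in zip(pred, truth))
        total = sum(pred) + sum(truth)
        # Convention: two empty masks agree perfectly.
        return 1.0 if total == 0 else 2.0 * intersection / total

    # Two toy 1-D masks with 4 foreground voxels each, 3 of them overlapping.
    pred  = [1, 1, 1, 1, 0, 0]
    truth = [0, 1, 1, 1, 1, 0]
    print(dice_coefficient(pred, truth))  # 0.75
    ```

    A Dice value of 0.9, as reported above, thus indicates that predicted and ground-truth masks overlap in 90% of their combined foreground volume.
    
    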

    The Chronic CARe for diAbeTes study (CARAT): a cluster randomized controlled trial

    Background: Diabetes is a major challenge for the health care system and especially for the primary care provider. The Chronic Care Model represents an evidence-based framework for the care of the chronically ill. An increasing number of studies have shown that implementing elements of the Chronic Care Model improves patient-relevant outcomes and process parameters. However, most of these findings were obtained in settings different from the Swiss health care system, which is dominated by single-handed practices.
    Methods/Design: CARAT is a cluster randomized controlled trial with general practitioners as the unit of randomization (trial registration: ISRCTN05947538). The study tests the hypothesis that implementing several elements of the Chronic Care Model via a specially trained practice nurse significantly improves the HbA1c level of type II diabetes patients after one year (primary outcome). Furthermore, we assume that the intervention increases the proportion of patients who achieve the recommended targets regarding blood pressure (<130/80), HbA1c (≤6.5%) and low-density lipoprotein cholesterol (<2.6 mmol/l), and that it increases patients' quality of life (SF-36) and several evidence-based quality indicators for diabetes care. These improvements in care will be assessed by the patients (PACIC-5A) as well as by the practice team (ACIC). According to the power calculation, 28 general practitioners will be randomized either to the intervention group or to the control group. Each general practitioner will include 12 patients suffering from type II diabetes. In the intervention group, both the general practitioner and the practice nurse will be trained to care for diabetes patients according to the Chronic Care Model in teamwork. In the control group, no intervention will be applied and patients will be treated as usual. Measurements (pre-data-collection) will take place in months II-IV, starting in February 2010. Follow-up data will be collected after one year.
    Discussion: This study tests the hypothesis that the Chronic Care Model can be easily implemented through a practice-nurse-focused approach. If our results confirm this hypothesis, the question arises whether this approach should also be implemented for other chronic diseases and multimorbid patients, and how care in Switzerland should be redesigned.

    Analysis of conducted emissions in the HV power network of electric vehicles

    With the progressing electrification of the powertrain and the growing number of high-voltage auxiliary units, it is becoming increasingly important to obtain precise knowledge of the useful-signal and disturbance conditions in the HV power network and to analyze their influencing factors [1] [2] [3]. The components operating in pulsed mode pose the greatest interference potential, owing to their fast switching edges at partly very high power levels. It is therefore advisable to minimize the emissions generated by a component as early as the design stage, through suitable circuits and clock-related modulation schemes [4] [5] [6]. However, due to the complexity of the overall vehicle power network, it is no longer sufficient to consider the various components only in isolation and to analyze them in laboratory setups. To define meaningful component requirements, it is important to also analyze the various components and subsystems within the overall system. This makes it possible to identify the relevant disturbances and potential interactions and to define corresponding component requirements. Comparatively simple laboratory setups with minimal peripherals, as used in EMC measurements, are well suited for conformity monitoring. For defining specifications and for a fundamental understanding of interactions and of the effectiveness of various component-level measures, however, investigations of the entire HV power network architecture under load conditions that are as realistic as possible are advisable. This makes it possible to capture relevant interactions in the overall system and to assess the effectiveness of component-level measures. Since both the system network and the possible load conditions can be very diverse, it is advantageous to carry out such fundamental analyses of disturbances and mitigation measures with simulation support.